This database contains wordlists collected as part of the Daghestanian loans project by the Linguistic Convergence Laboratory at NRU HSE. The 160-item shortlist, which is based on the World Loanword Database questionnaire, is designed to measure lexical contact on a micro-level: to quantify lexical convergence among the speech communities of minority languages at the village level, and to detect fine-grained areal patterns beyond general observations on the spheres of influence of certain languages.
Contents: 25796 target words in 23 languages.
If you use data from the database in your research, please cite as follows:
Chechuro I., Daniel M., Dobrushina N., and Verhees S. 2019. Daghestanian loans database. Linguistic Convergence Laboratory, HSE. (Available online at https://lingconlab.github.io/Dagloan_database/DL_database.html, accessed on April 01, 2019.)
For now, the table shows source Concepts and target Words. Each target word is assigned to a similarity Set, that is, a set of words that have the same meaning and look similar. In the future, data will be added on borrowing sources. Metadata includes the name of the Village where the word was recorded, the administrative District it is part of, the Language spoken there, and the List ID. Each List ID corresponds to a particular speaker or, in some cases, to a written source such as a dictionary. Data is accessible at: Github/LingConLab/DagloanDatabase.
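To illustrate how the fields described above fit together, here is a minimal Python sketch. It is not the project's own method. The field names (`concept`, `word`, `set`, `village`, `list_id`), the sample rows, and the overlap measure are all assumptions made for illustration; the actual column names in the database files may differ. The sketch computes, for two lists, the proportion of shared concepts for which both lists attest a word from the same similarity set.

```python
from collections import defaultdict

# Hypothetical rows: every value here is a placeholder, not real database content.
ROWS = [
    {"concept": "bucket", "word": "word_1", "set": 1, "village": "Village A", "list_id": "A1"},
    {"concept": "window", "word": "word_2", "set": 1, "village": "Village A", "list_id": "A1"},
    {"concept": "bucket", "word": "word_3", "set": 1, "village": "Village B", "list_id": "B1"},
    {"concept": "window", "word": "word_4", "set": 2, "village": "Village B", "list_id": "B1"},
    {"concept": "bucket", "word": "word_5", "set": 2, "village": "Village C", "list_id": "C1"},
]


def sets_by_list(rows):
    """Map each list ID to the (concept, similarity set) pairs it attests."""
    result = defaultdict(set)
    for row in rows:
        result[row["list_id"]].add((row["concept"], row["set"]))
    return dict(result)


def shared_proportion(rows, list_a, list_b):
    """Share of concepts attested in both lists where the lists have a
    similarity set in common. Returns None if no concept is attested in both."""
    by_list = sets_by_list(rows)
    pairs_a, pairs_b = by_list[list_a], by_list[list_b]
    common_concepts = {c for c, _ in pairs_a} & {c for c, _ in pairs_b}
    if not common_concepts:
        return None
    shared_concepts = {c for c, _ in pairs_a & pairs_b}
    return len(shared_concepts) / len(common_concepts)


print(shared_proportion(ROWS, "A1", "B1"))
print(shared_proportion(ROWS, "A1", "C1"))
```

Comparing lists rather than villages keeps each speaker or dictionary source separate; aggregating to the village level would be a further design decision.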
The dataset in the dummy format is available here.
Version: 2019-04-01. For questions or comments contact jh.verhees@gmail.com.
Hover over or click on a dot on the map to learn more. The color of each dot corresponds to the number of lists collected in that village. Orange = dictionary data.
The map below shows the distribution of different stems for the concept ‘bucket’. In most cases the Russian word vedro is borrowed.
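A distribution like the one on the map can be tallied from the table with a few lines of Python. The sketch below is only an illustration under stated assumptions: the row layout `(village, concept, stem)` and all sample values except the stem label `vedro` are hypothetical placeholders, not data from the database.

```python
from collections import Counter

# Hypothetical rows as (village, concept, stem label); placeholders only.
ROWS = [
    ("Village A", "bucket", "vedro"),
    ("Village A", "bucket", "vedro"),   # same stem from a second list in the same village
    ("Village B", "bucket", "vedro"),
    ("Village C", "bucket", "stem_2"),
    ("Village A", "window", "stem_1"),
]


def stem_distribution(rows, concept):
    """Count the number of villages in which each stem is attested for a concept."""
    # Deduplicate so that a village with several lists counts once per stem.
    seen = {(village, stem) for village, c, stem in rows if c == concept}
    return Counter(stem for _, stem in seen)


print(stem_distribution(ROWS, "bucket"))
```

Counting villages rather than individual lists prevents a village with many speakers from dominating the tally.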